It has been suggested that, in voice over IP, an appropriate choice of the distribution used to model the delay jitter can improve the play-out algorithm. In this paper, we propose a tool that determines, at a given instant, which distribution model best explains the observed jitter. The tool uses Expectation Maximization to choose among candidate distribution models, which include the i.i.d. exponential distribution and the gamma distribution.



1. Introduction 

Voice over IP is subject to various constraints, one of which is imposed by the network. This network constraint manifests itself as delays in packet arrivals at the destination. Typically, the delay exhibits jitter, meaning that it is not uniform across packet arrivals.

It has been explored in [2] and [?] that if the delay characteristics are captured in a suitable model, several useful quantities can be determined, one of them being the optimal play-out time. [?] discusses how the jitter is given by the variance of the distribution model, whereas the first-order moment gives the mean delay spacing between packets. [3] also discusses how the clock skew can be computed given a distribution model for the jitter.

[3] notes that there are several plausible distribution models; a few have been discussed in [?] and [?]. In this paper, we give an approach for selecting among these models.



2. Notations 

Consider the following scenario. Two terminals, A and B, communicate over a packet-switched network. The actual delay of a packet arriving at B is the sum of the minimum transmission delay from A to B over the network and v(t), where v(t) is a random variable characterizing the extra delay added by the network (the delay jitter) at time t.
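The decomposition above can be illustrated with a minimal sketch. It assumes that sender and receiver clocks are synchronized and that the smallest observed one-way delay is a usable stand-in for the minimum transmission delay; neither assumption comes from the text, and the function name `extract_jitter` is hypothetical.

```python
def extract_jitter(send_times, recv_times):
    """Split one-way delays into a minimum delay and per-packet jitter v(j).

    Assumption: the smallest observed delay approximates the minimum
    transmission delay from A to B, and both clocks are synchronized.
    """
    if len(send_times) != len(recv_times) or not send_times:
        raise ValueError("need two non-empty timestamp lists of equal length")
    delays = [r - s for s, r in zip(send_times, recv_times)]
    d_min = min(delays)
    # The jitter is whatever the network added on top of the minimum delay.
    jitter = [d - d_min for d in delays]
    return d_min, jitter


# Four packets sent every 20 ms (timestamps in seconds).
send = [0.00, 0.02, 0.04, 0.06]
recv = [0.05, 0.08, 0.09, 0.13]
d_min, v = extract_jitter(send, recv)
print(d_min, v)
```

The packet with the smallest delay always receives a jitter of exactly zero under this convention, which matters later when a density is evaluated or fitted at v = 0.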

3. The Approach 

We aim to better understand the arrival process of audio packets at the receiver. The objective is to ascertain whether the delay jitter of packets arriving at a receiver follows a well-known distribution, such as the exponential, gamma, or geometric distribution. We note the corresponding distributions below.

[3] models v(t) as an independent, identically distributed (i.i.d.) random process with exponential probability density function

$$f_v(v) = u(v)\,\mu \exp(-\mu v) \qquad (1)$$

where u(v) is the unit step function.

On the other hand, the gamma distribution is given as

$$f_v(v) = \frac{v^{a-1} \exp(-v/b)}{b^{a}\,\Gamma(a)} \qquad (2)$$

The idea that the gamma distribution may model the delay jitter comes from comparing the p.d.f. of the gamma distribution with the histogram of the jitter history: the two match very closely.
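For concreteness, the two candidate densities can be sketched as follows. The sketch assumes that the gamma density takes the usual shape-scale form v^(a-1) exp(-v/b) / (b^a Γ(a)), with shape a and scale b; the function names are illustrative. The moment helpers return the mean and variance of each model, the two quantities referred to in the introduction.

```python
import math


def exponential_pdf(v, mu):
    """Exponential density u(v) * mu * exp(-mu * v) with rate mu."""
    if v < 0:
        return 0.0
    return mu * math.exp(-mu * v)


def gamma_pdf(v, a, b):
    """Gamma density with shape a and scale b (assumed parameterization).

    Evaluated in log space to avoid overflow in v**(a - 1) and Gamma(a).
    For simplicity the sketch returns 0 at v == 0 as well, which is exact
    only for a > 1.
    """
    if v <= 0:
        return 0.0
    log_pdf = (a - 1) * math.log(v) - v / b - a * math.log(b) - math.lgamma(a)
    return math.exp(log_pdf)


def exponential_moments(mu):
    """Mean and variance of the exponential model."""
    return 1.0 / mu, 1.0 / mu ** 2


def gamma_moments(a, b):
    """Mean and variance of the gamma model."""
    return a * b, a * b * b


print(exponential_pdf(0.01, 50.0), gamma_pdf(0.01, 2.0, 0.01))
print(exponential_moments(50.0), gamma_moments(2.0, 0.01))
```

With a = 1 the gamma density reduces to the exponential density with rate 1/b, so the two candidate models are nested rather than disjoint.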

Various authors have suggested different models for the delay jitter. The intuition we draw from this is that the network exhibits different behavior at different times, and therefore the delay jitter follows different models at different times. All that can be observed at the receiver is the delay jitter itself. What remains hidden is which statistical distribution the jitter currently follows (over the last n packets, say). Thus we have hidden variables, in the form of indicator variables, which indicate which section of the data in the recent history corresponds to which distribution.

3.1. The Expectation Maximization Algorithm

We now introduce the Expectation Maximization algorithm, better known as the EM algorithm.

In many practical learning settings, only a subset of the relevant instance features is observable. The EM algorithm is a method for maximum likelihood estimation in such settings.

The idea behind EM is as follows:

• We have some observed data, but maximum likelihood estimation of our model is complicated.

• If some extra variables were available, maximum likelihood estimation in the "augmented space" would be much simpler.

• The extra variables may be hypothetical.

Interested readers are referred to [2]. A natural application of EM is to missing-data problems.



This algorithm therefore provides an effective tool to determine, from the given data and a given set of candidate distribution functions, which distribution function each data point belongs to.

3.2. The algorithm to determine the correct model

Suppose we have a history file recording the delay jitter of a certain number of packets in the recent past, say Count = 30,000 (typically the number of packets in the trace files we used in our experiments). We index these packets with j, where j varies from 1 to Count. Let v(j) be the observed delay jitter for the j-th packet in that sequence. Let i index the i-th model. For the time being, we focus on only two values of i: 1 for the exponential distribution and 2 for the gamma distribution. Let $z_{ij}$ be the indicator variable for the j-th packet, where i is either 1 or 2:

$$z_{ij} = \begin{cases} 1 & \text{if the } j\text{-th packet comes from the } i\text{-th distribution;} \\ 0 & \text{otherwise.} \end{cases} \qquad (3)$$

Let

1. $d_i$ = subset of data points from v which come from the i-th distribution model

2. $\alpha_i$ = parameter set for the i-th distribution model

3. $p_i(v_k \mid \alpha_i)$ = pdf for data point $v_k$ under the i-th distribution model

There are two steps for modeling the data.

(I) The E (Expectation) Step. Calculate the expected value of $z_{ij}$ as

$$E[z_{ij}] = \frac{p_i(v_j \mid \alpha_i)}{\sum_{k} p_k(v_j \mid \alpha_k)} \qquad (4)$$

The new values of $z_{ij}$ are calculated as

$$z_{ij} = \begin{cases} 1 & \text{if } E[z_{ij}] > E[z_{kj}] \;\; \forall k \neq i \\ 0 & \text{otherwise} \end{cases}$$

Then estimate the $d_i$'s as

$$d_i = \{\, v(j) \in v : z_{ij} = 1 \,\} \qquad (5)$$

(II) The M (Maximization) Step. Find the new Maximum Likelihood Estimates (MLEs) for all the parameter sets $\alpha_i$ using standard methods.

Repeat steps (I) and (II) above until the $z_{ij}$ values stabilize. Figure 1 shows the pseudocode for the proposed algorithm.
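The two steps can be sketched in Python as below. This is an illustration under stated assumptions rather than the implementation used for the experiments: the gamma parameters are fitted with a common closed-form approximation to the maximum likelihood estimate instead of an exact solver, jitter values are floored at a small `EPS` so that logarithms are defined, ties go to the exponential model, and all function names are hypothetical.

```python
import math
import random

EPS = 1e-9  # assumed floor so that log() is defined for zero jitter


def exp_pdf(v, mu):
    """Exponential density mu * exp(-mu * v) for v >= 0."""
    return mu * math.exp(-mu * v) if v >= 0 else 0.0


def gamma_pdf(v, a, b):
    """Gamma density with shape a and scale b, evaluated in log space."""
    if v <= 0:
        return 0.0
    log_p = (a - 1) * math.log(v) - v / b - a * math.log(b) - math.lgamma(a)
    return math.exp(log_p)


def fit_exponential(data):
    """MLE of the exponential rate: 1 / sample mean."""
    return len(data) / sum(data)


def fit_gamma(data):
    """Approximate gamma MLE via a closed-form approximation of the shape."""
    mean = sum(data) / len(data)
    s = math.log(mean) - sum(math.log(x) for x in data) / len(data)
    s = max(s, 1e-12)  # guard against a degenerate, constant sample
    a = (3 - s + math.sqrt((s - 3) ** 2 + 24 * s)) / (12 * s)
    return a, mean / a


def hard_em(jitter, max_iter=50):
    """Alternate hard E and M steps until the indicators stop changing."""
    data = [max(x, EPS) for x in jitter]
    # Initial estimates: assume the whole history belongs to each model.
    mu = fit_exponential(data)
    a, b = fit_gamma(data)
    z1 = None
    for _ in range(max_iter):
        # E step: expected indicator, then hard assignment (ties go to model 1).
        new_z1 = []
        for x in data:
            p1, p2 = exp_pdf(x, mu), gamma_pdf(x, a, b)
            total = p1 + p2
            e1 = p1 / total if total > 0 else 0.5
            new_z1.append(1 if e1 >= 0.5 else 0)
        if new_z1 == z1:
            break
        z1 = new_z1
        # M step: refit each model on the points currently assigned to it.
        d1 = [x for x, z in zip(data, z1) if z == 1]
        d2 = [x for x, z in zip(data, z1) if z == 0]
        if len(d1) >= 2:
            mu = fit_exponential(d1)
        if len(d2) >= 2:
            a, b = fit_gamma(d2)
    return z1, mu, (a, b)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic trace: a gamma stretch followed by an exponential stretch.
    trace = [random.gammavariate(5.0, 0.01) for _ in range(1000)]
    trace += [random.expovariate(50.0) for _ in range(1000)]
    z1, mu, (a, b) = hard_em(trace)
    print("share labeled exponential, first half:", sum(z1[:1000]) / 1000)
    print("share labeled exponential, second half:", sum(z1[1000:]) / 1000)
```

Because each packet is assigned wholly to one model, this is a hard-assignment variant of EM. A model that receives fewer than two points in an iteration keeps its previous parameters, which is a design choice of the sketch and not something the text specifies.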



4. Experiments and results 

1. The experiments were performed on the trace files used in [?]. There were 6 such trace files. The EM algorithm was run on the data until the indicator variables stabilized. Below we plot the first set of indicator variables, $z_{1j}$.

2. For the first trace file, the algorithm converged in a single step. The single plot is shown in Figure 5.



1. /* Read the delay jitter history of N packets */

   • X = [x(j)] : j = 1 to N

2. /* Suppose the choices for distribution are functions A and B.
   Initial parameter estimates are [a] for A, [b] for B (obtained
   by assuming the entire history belongs to one distribution) */

3. /* z1(j), z2(j) : indicator variables for A, B respectively */

4. /* Run the EM algorithm for up to K iterations, until the indicator variables stabilize */

5. For i = 1 to K {

   (a) E step:

   (b) For j = 1 to N {

       i. z1(j) = p[x(j) | [a]] / {p[x(j) | [a]] + p[x(j) | [b]]}

       ii. z2(j) = 1 - z1(j)

       iii. if z1(j) >= z2(j)

            • z1(j) = 1, z2(j) = 0

       iv. else

            • z1(j) = 0, z2(j) = 1

       }

   (c) M step:

   (d) for all x(j) where z1(j) = 1

       a = parameter estimate of x(j)

   (e) for all x(j) where z2(j) = 1

       b = parameter estimate of x(j)

6. } end for

7. plot(z1)

Figure 1. The Pseudo-code




Figure 2. Graph for trace 1 (generated for packets 
1-56000) 



Figure [?] shows that in the first half the jitter tends to follow the gamma distribution, whereas later on it tends to follow the i.i.d. exponential distribution.

The algorithm was then run on a smaller window of 3500 packets, which was moved over this trace file. The corresponding plots are shown in Figures 3, 4, 5 and 6.

Figure 3. Graph for trace 1 (generated for packets 0-3500)

The observation is that when we move a window of size 3500 over the trace, the transition from the gamma to the exponential distribution becomes apparent (in Fig. [?]) around the actual region of change (as in Fig. [?]), albeit a little delayed. For the regions in which the exponential or the gamma distribution predominates, this smaller window is in complete agreement (gamma for Fig. [?] and exponential as in Fig. [?]).
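The windowed analysis can be sketched as a small driver that applies a classifier to successive windows and reports the share of packets assigned to model 1. The `classify` argument stands for any routine that returns one 0/1 indicator per packet, such as the EM procedure of Section 3.2; the `threshold_classifier` below is only a stand-in so that the sketch runs on its own, and the window and step sizes in the toy call are arbitrary.

```python
def sliding_windows(jitter, classify, window=3500, step=3500):
    """Classify each window and report the share of packets given to model 1.

    `classify` is any callable that maps a list of jitter values to a list
    of 0/1 indicators of the same length (1 = model 1).
    """
    reports = []
    for start in range(0, len(jitter) - window + 1, step):
        labels = classify(jitter[start:start + window])
        share = sum(labels) / len(labels)
        reports.append((start, start + window, share))
    return reports


def threshold_classifier(chunk, cutoff=1.0):
    """Stand-in for the EM routine: label 1 when jitter exceeds `cutoff`."""
    return [1 if x > cutoff else 0 for x in chunk]


# Toy trace whose behavior changes halfway through.
toy = [0.1] * 10 + [5.0] * 10
for start, end, share in sliding_windows(toy, threshold_classifier, window=5, step=5):
    print(start, end, share)
```

A step smaller than the window gives overlapping windows, which locates a transition more finely at the cost of running the classifier more often.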

5. Conclusions 

In this paper, we have proposed a way of choosing between alternative models for the delay jitter of packets in voice over IP. We found that the jitter does follow one particular model over a stretch of packets, provided that the right candidate models are available to choose from. How the variance and mean of the captured model can be used in play-out time estimation has already been partly dealt with in [?].

Figure 4. Graph for trace 1 (generated for packets 35000-38500)

We recommend a middle-tier architecture, as shown in Figure 7, for capturing the jitter distribution characteristics. The proxy, for instance, can monitor over intervals of, say, 30,000 packets (or any duration over which the network characteristics are expected not to change). At periodic intervals, the proxy can determine the nature of the delay jitter distribution over a window of, say, 3500 packets. The distribution type and the corresponding parameters can be indicated to the receivers through a special field in the header. This has several advantages. First, the individual receivers are not overloaded. Second, the proxy can monitor the network continuously, since it receives voice packets for different destinations from different sources all the time. Third, this architecture greatly reduces the redundancy that would otherwise be incurred if each receiver estimated the distribution itself. Finally, it provides a neat way of dynamically updating the receivers on the status of the network at any time.
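One way the proxy could carry the distribution type and its parameters in a header field is sketched below. The layout is purely hypothetical (a one-byte model identifier followed by two 64-bit floats in network byte order) and is not taken from any existing protocol; the identifiers 1 and 2 simply mirror the model indices used in Section 3.2.

```python
import struct

# Assumed numbering, mirroring i = 1 (exponential) and i = 2 (gamma).
MODEL_IDS = {"exponential": 1, "gamma": 2}
MODEL_NAMES = {number: name for name, number in MODEL_IDS.items()}

# Hypothetical layout: network byte order, 1-byte model id, two float64 values.
FIELD_FORMAT = "!Bdd"


def encode_model_field(model, params):
    """Pack the model name and up to two parameters into a 17-byte field."""
    padded = list(params) + [0.0] * (2 - len(params))
    return struct.pack(FIELD_FORMAT, MODEL_IDS[model], padded[0], padded[1])


def decode_model_field(blob):
    """Recover the model name and its parameters from the field."""
    model_id, p1, p2 = struct.unpack(FIELD_FORMAT, blob)
    name = MODEL_NAMES[model_id]
    # The exponential model has a single parameter; the padding is dropped.
    return name, ((p1,) if name == "exponential" else (p1, p2))


field = encode_model_field("gamma", (2.5, 0.004))
print(len(field), decode_model_field(field))
```

A fixed-size field keeps parsing at the receiver trivial; a model with more than two parameters would need a different layout.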

The question remains: in which layer should this task of determining the appropriate model be performed? A possible solution is to do this at the receiver end, in the VoIP software. The software continuously monitors the packets arriving at the receiver. Periodically (say, once every 10 seconds, or over any duration within which the network characteristics generally change), it takes a history of approximately 30,000 packets (if there is no conversation, it can send dummy test packets over the network and monitor their delay) and computes from them the most probable model for the present. In this way, when a call is received at any instant, the system can suggest, based on the history, which model of jitter distribution the network imposes on the current conversation.

Figure 5. Graph for trace 1 (generated for packets 39000-42500)

Figure 6. Graph for trace 1 (generated for packets 42500-46000)

Figure 7. The proposed architecture